Separating Structure from Interestingness
Abstract
Condensed representations of pattern collections have been recognized as important building blocks of inductive databases, a promising theoretical framework for data mining, and they have recently been studied actively. However, there has not been much research on how condensed representations should actually be represented. In this paper we propose a general approach to building condensed representations of pattern collections. The approach is based on separating the structure of the pattern collection from the interestingness values of the patterns. We also study the concrete case of representing the frequent sets and their (approximate) frequencies following this approach: we discuss the trade-offs in representing the frequent sets by the maximal frequent sets, the minimal infrequent sets and their combinations, and we investigate the problem of approximating the frequencies from samples by giving new upper bounds on sample complexity based on frequent closed sets and by describing how convex optimization can be used to improve and score the obtained samples.
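To make the separation concrete, here is a minimal sketch, assuming a toy transaction database and brute-force enumeration. It is not the paper's algorithm, and all function names (`frequent_sets`, `maximal_sets`, `minimal_infrequent_sets`, `downward_closure`) are hypothetical. It shows that the structure of the frequent set collection can be recovered from the maximal frequent sets alone, while the frequencies are stored separately, and that the minimal infrequent sets describe the same boundary from the other side.

```python
from itertools import combinations

def frequent_sets(transactions, min_support):
    """Brute-force: map each non-empty frequent itemset to its frequency."""
    items = sorted(set().union(*transactions))
    n = len(transactions)
    result = {}
    for k in range(1, len(items) + 1):
        found = False
        for cand in combinations(items, k):
            s = frozenset(cand)
            freq = sum(1 for t in transactions if s <= t) / n
            if freq >= min_support:
                result[s] = freq
                found = True
        if not found:
            break  # anti-monotonicity: no larger set can be frequent
    return result

def maximal_sets(frequent):
    """Frequent sets that have no frequent proper superset."""
    return {s for s in frequent if not any(s < t for t in frequent)}

def minimal_infrequent_sets(frequent, items):
    """Infrequent sets all of whose proper subsets are frequent."""
    freq = set(frequent) | {frozenset()}  # the empty set is always frequent
    border = set()
    for s in freq:
        for i in items:
            if i in s:
                continue
            c = s | {i}
            if c not in freq and all(c - {j} in freq for j in c):
                border.add(c)
    return border

def downward_closure(maximal):
    """All non-empty subsets of the given sets (structure only, no values)."""
    closed = set()
    for m in maximal:
        for k in range(1, len(m) + 1):
            closed.update(frozenset(c) for c in combinations(m, k))
    return closed

# Toy data, chosen for illustration only.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "d"}]
freq = frequent_sets(transactions, min_support=0.5)
maximal = maximal_sets(freq)
border = minimal_infrequent_sets(freq, {"a", "b", "c", "d"})
```

In this toy example the two maximal sets {a, b} and {a, c} are enough to rebuild all five frequent sets, whereas the frequencies have to be stored or approximated on their own.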
Similar resources
An Efficient Algorithm for Mining Sequential Rules with Interestingness Measures
Mining sequential rules is an important problem in data mining research. It is commonly used for market decisions, management and behaviour analysis. In traditional association-rule mining, rule interestingness measures such as confidence are used for determining relevant knowledge. They can reduce the size of the search space and select useful or interesting rules from the set of the discover...
A Graph-based Clustering Approach to Evaluate Interestingness Measures: A Tool and a Comparative Study
Finding interestingness measures to evaluate association rules has become an important knowledge quality issue in KDD. Many interestingness measures may be found in the literature, and many authors have discussed and compared interestingness properties in order to improve the choice of the most suitable measures for a given application. As interestingness depends both on the data structure and ...
ARQAT: An Exploratory Analysis Tool For Interestingness Measures
Finding interestingness measures to evaluate association rules has become an important knowledge quality issue in KDD. Many interestingness measures may be found in the literature, and many authors have discussed and compared interestingness properties in order to help choose the best measures for a given application. As interestingness depends both on the data structure and on the decision-mak...
Semantic interestingness measures for discovering association rules in the skeletal dysplasia domain
Background: Lately, ontologies have become a fundamental building block in the process of formalising and storing complex biomedical information. With the currently existing wealth of formalised knowledge, the ability to discover implicit relationships between different ontological concepts becomes particularly important. One of the most widely used methods to achieve this is association rule mi...
Relative Measure for Mining Interesting Rules
This paper presents a measure which estimates the interestingness of a rule relative to its corresponding common sense rules. Mining interesting rules is one of the important data mining tasks. Interesting rules bring novel knowledge that helps decision makers take advantageous actions. Interestingness is a relative issue: it depends on what is known about the domain. A measure which can estim...